ABSTRACT 

The performance of the following four methodologies 
for assessing unidimensionality was examined: (1) DIMTEST; (2) the 
approach of P. W. Holland and P. R. Rosenbaum; (3) linear factor 
analysis; and (4) non-linear factor analysis. Each method is examined 
and compared with other methods using simulated data sets and real 
data sets. Seven data sets, all with 2,000 examinees, were generated: 
3 unidimensional and 4 two-dimensional data sets. Two levels of 
correlation between abilities were considered: ρ=0.3 and ρ=0.7. Eight 
real data sets were used; four were expected to be unidimensional, 
and the other four were expected to be two-dimensional. Findings 
suggest that, while the linear factor analysis often overestimated 
the number of underlying dimensions, the other three methods 
correctly confirmed unidimensionality, but differed in their ability 
to detect the lack of unidimensionality. DIMTEST showed excellent 
power in detecting the lack of unidimensionality. Holland and 
Rosenbaum's approach and non-linear factor analysis approaches showed 
good power, provided the correlation between abilities was low. Four 
tables present study data, and there is a 46-item list of references. 
(Author/SLD) 

Assessing Dimensionality of a Set of Items: Comparison of Different Approaches 



Abstract 

This study examines the performance of the following four methodologies for 
assessing unidimensionality: DIMTEST, Holland and Rosenbaum's approach, linear factor 
analysis, and nonlinear factor analysis. Each method is examined and compared with other 
methods on simulated data sets and on real data sets. Seven data sets, all with 2000 
examinees, were generated: three unidimensional, and four two-dimensional data sets. Two 
levels of correlation between abilities were considered: ρ=.3 and ρ=.7. Eight different real 
data sets were used: four of them were expected to be unidimensional, and the other four 
were expected to be two-dimensional. Findings suggest that, while the linear factor 
analysis often overestimated the number of underlying dimensions, the other three methods 
correctly confirmed unidimensionality but differed in their ability to detect lack of 
unidimensionality. DIMTEST showed excellent power in detecting lack of 
unidimensionality; Holland and Rosenbaum's and nonlinear factor analysis approaches 
showed good power, provided the correlation between abilities was low. 



Subject terms: DIMTEST, unidimensionality, essential dimensionality, non-linear factor 
analysis, item response theory. 

Assessing Dimensionality-Comparison 



It is well known that most item response theory (IRT) models require the 
assumption of unidimensionality. According to Lord and Novick (1968), dimensionality is 
defined as the total number of abilities required to satisfy the assumption of local 
independence. If there is only one ability affecting the responses of a set of items to meet 
the assumption of local independence, then that set is referred to as a unidimensional set. 
It has also been long argued that responses to test items are multiply determined 
(Humphreys, 1981, 1985, 1986; Hambleton & Swaminathan, 1985, chap. 2; Reckase, 1979, 
1985; Stout, 1987; Traub, 1983; Yen, 1985), and several abilities unique to items or 
common to relatively few items are inevitable. The ability which the test is intended to 
measure (i.e., the ability common to all items) will be referred to as the dominant ability, 
and abilities unique to or influencing responses to few items will be referred to as minor 
abilities. Given that item responses are multiply determined, it is intuitively clear that, in 
order to satisfy the assumption of unidimensionality, it is required that a given test 
measure a single dominant ability. A number of simulation studies have demonstrated that 
a dominant ability can be recovered well, using computer programs such as LOGIST, in 
the presence of several minor factors (Reckase, 1979; Drasgow & Parsons, 1983; Harrison, 
1986). Although counting only dominant dimensions violates Lord and Novick's (1968) 
definition of dimensionality, it is commonly accepted that, in order to apply 
unidimensional item response theory models, it is sufficient to show that there is one 
dominant ability underlying the responses to a set of items. 

Stout (1987, 1990) provided a mathematically rigorous definition of dominant 
dimensionality referred to as essential dimensionality and provided a statistical test 
(DIMTEST) to assess whether a set of items met the requirement for essential 
unidimensionality. Junker (1988, 1991) further explored essential dimensionality for 
dichotomous and polytomous items and established consistency results for the maximum 
likelihood ability estimates of θ under essential unidimensionality. Essential dimensionality 
is the total number of abilities required to satisfy the assumption of essential independence. 
An item pool is said to be essentially independent (EI) with respect to the latent variable 
vector Θ if, for a given subset of items, the average absolute conditional (on Θ) covariance 
of responses to item pairs approaches zero as the length of the subset increases. When 
conditional covariances based on only one dominant ability meet the assumption of 
essential independence, the response data are said to be essentially unidimensional ($d_E = 1$). 
In contrast, the assumption of local independence requires that the conditional covariances 
be zero for responses to any item pair, and the number of abilities required to achieve zero 
conditional covariances is the dimensionality. According to this definition of 
dimensionality, all major and minor abilities influencing item responses have to be 
considered when assessing the local independence assumption; whereas, according to the 
essential dimensionality, it is sufficient to consider only the influence of dominant abilities. 
Hence, essential independence and essential dimensionality are weaker forms of local 
independence and traditional dimensionality respectively. 
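
The contrast between the two conditions can be written compactly. The display below is a sketch that follows the verbal definitions in this section rather than quoting Stout's formal statement; the symbols $U_i$ (response to item i) and $N$ (number of items in the subset) are notation assumed here for illustration.

```latex
% Essential independence (EI): the average absolute conditional covariance vanishes
\frac{1}{\binom{N}{2}} \sum_{1 \le i < j \le N}
  \left| \operatorname{Cov}(U_i, U_j \mid \Theta = \theta) \right| \longrightarrow 0
  \quad \text{as } N \to \infty, \text{ for each } \theta .

% Local independence, as characterized in the text: every conditional covariance is zero
\operatorname{Cov}(U_i, U_j \mid \Theta = \theta) = 0
  \quad \text{for all } i \neq j \text{ and all } \theta .
```

Under the first condition individual item pairs may retain nonzero conditional covariance, as long as the average over all pairs becomes negligible; this is the sense in which minor abilities can be tolerated.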

Stout's definition of essential dimensionality is conceptually based on an infinite 
item pool. An infinite item pool can be conceptualized in two ways: 1. as a consequence of 
continuing the test construction process beyond the N items of the test being studied where 
the N items become a subset of the item pool; 2. as a consequence of a sequence of finite 
tests where each finite test is optimally constructed. For example, a 20-item test is 
constructed with the knowledge that the test is going to be only 20 items long and that it is 
not necessarily a subset of an optimal 40-item test. In this way, an item pool is a collection 
of optimal finite test length tests (for details see Junker, 1991; Junker & Stout, 1991). 

In assessing essential unidimensionality of given item responses, DIMTEST assesses 
the likelihood that the given set of item responses come from an essentially unidimensional 
item pool. That is, DIMTEST assesses whether or not the model generating the given item 
responses is close to the essentially unidimensional (EI, $d_E = 1$) model. The major focus in assessing essential 
unidimensionality of a given set of item responses is to determine how "minor" the 
influence of minor abilities is and whether the influence of these minor abilities can be 
ignored when assessing essential unidimensionality. 

Historically speaking, linear factor analysis has been used to assess the 
dimensionality of the latent space underlying the responses to a set of items. If the results 
indicate a one-factor solution, then it can be inferred that one dominant ability is 
influencing item responses. There are, however, a number of technical as well as 
methodological problems associated with using linear factor analyses to assess 
dimensionality. For example, difficulty levels of items and guessing levels of 
multiple-choice items can each play a major role in affecting the factor structure of item 
responses (for details see Carroll, 1945; Hulin, Drasgow, & Parsons, 1983, chap. 8; Zwick, 
1987). Consequently, many attempts have been made by researchers in recent years to 
develop new methods to assess dimensionality. Some of the recently developed methods 
include nonlinear factor analysis (McDonald & Ahlawat, 1974); Bejar's procedure (Bejar, 
1980); order analysis (Wise, 1981); modified parallel analysis (Hulin, Drasgow, & Parsons, 
1983, p. 255); residual analysis (Hambleton & Swaminathan, 1985, p. 163); Bock's full 
information factor analysis (Bock, Gibbons, & Muraki, 1985); Holland and Rosenbaum's 
test of unidimensionality, monotonicity, and conditional independence (Rosenbaum, 1984; 
Holland & Rosenbaum, 1986); Roznowski, Tucker, and Humphreys' procedures (1991); and 
Stout's unidimensionality procedure DIMTEST (Stout, 1987). 

Hattie (1985), Hambleton and Rovinelli (1986), and Berger and Knol (1990) have 
reviewed several procedures for assessing dimensionality, including some of the above 
mentioned procedures. The main focus of this paper is to study and compare some of the 
procedures to assess dimensionality that are most recent, seem promising, and are little 
studied. Four procedures are considered and compared in this paper: DIMTEST, Holland 
and Rosenbaum's procedure, nonlinear factor analysis, and linear factor analysis. Linear 
factor analysis was used, because of its historical importance, as a benchmark to compare 
other procedures. Several sets of unidimensional and multidimensional test data were 
simulated and used to study the performance of all four procedures for assessing 
dimensionality. The same procedures were then repeated with real test data. 

Description of Procedures 

Linear Factor Analysis 

Linear factor analysis is the most commonly used approach to assess dimensionality. 
With linear factor analysis, each extracted factor is presumed to represent a dimension, 
and items that load heavily on a given factor are considered good measures of that 
dimension. There are a number of fundamental problems associated with applying linear 
factor analysis to binary data. First, linear factor analysis assumes that the relationship 
between the observed variables and the underlying factors is linear and that the variables 
are continuous in nature. But it is clear for dichotomous data that the relationship between 
the performance and the underlying latent variable is not linear. Hence, applying factor 
analysis to phi or tetrachoric correlations of binary item responses produces difficulty 
factors (Hulin, Drasgow, & Parsons, 1983, chap. 8). Second, in computing tetrachoric 
correlations, the cell entries of the fourfold table for a pair of dichotomous items sometimes 
equal zero, making it difficult to determine an appropriate value for the correlation. Third, 
determination of the number of significant factors could be problematic. 

In this study the statistical package LISCOMP was used to perform exploratory 
linear factor analysis using tetrachoric correlations. Three different approaches were used 
to determine the number of significant factors: parallel analysis, the chi-square test of 
goodness of fit, and goodness of fit statistics (the means and standard deviations of the 
squares of residual correlations and absolute residuals). 

According to parallel analysis (Humphreys & Montanelli, 1975), the eigenvalues of 
the given correlation matrix are compared with the eigenvalues of random data. The 
random data consist of binary responses generated with the same number of items and 
examinees as that of the given data. The largest eigenvalue from the random data is used 
as the cutoff point for eigenvalues from the actual data to determine the number of 
significant factors. That is, the number of eigenvalues of the actual data greater than the 
largest eigenvalue of the random data is taken as the significant number of factors 
underlying the given data. 
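
The cutoff rule just described can be sketched in a few lines of Python. The function name and the eigenvalues in the usage lines are hypothetical and serve only as illustration; computing the eigenvalues themselves (from tetrachoric correlations in this study) is assumed to happen elsewhere.

```python
def count_significant_factors(actual_eigenvalues, random_eigenvalues):
    """Count eigenvalues of the actual data that exceed the largest
    eigenvalue obtained from the random data (parallel-analysis cutoff)."""
    cutoff = max(random_eigenvalues)
    return sum(1 for ev in actual_eigenvalues if ev > cutoff)


# Hypothetical eigenvalues, for illustration only (not values from the study).
actual = [6.8, 1.9, 1.1, 0.9, 0.7]
random_data = [1.25, 1.20, 1.15, 1.10, 1.05]
n_factors = count_significant_factors(actual, random_data)
```

With these made-up numbers, two eigenvalues of the actual data exceed the random-data cutoff of 1.25, so two factors would be retained.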

The second method used to determine the number of factors was the chi-square test 
of goodness of fit from LISCOMP. The third method involves comparisons of means and 
standard deviations of squares of residuals and absolute values of residuals after fit of an 
m-factor model with the corresponding values from the random data. If the residuals are 
sufficiently "small," then one can regard the fit of the model as "reasonably satisfactory" 
(McDonald, 1981; Hattie, 1985; Hambleton & Rovinelli, 1986; Berger & Knol, 1990). 
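
As a rough illustration of these residual summaries, the sketch below computes the mean and standard deviation of the squared and absolute off-diagonal residuals between an observed correlation matrix and the matrix reproduced by a fitted model. The function name, the use of the population standard deviation, and the small matrices are assumptions made for this example; they are not taken from LISCOMP or NOFA output.

```python
from statistics import mean, pstdev


def residual_fit_summary(observed, reproduced):
    """Summarize off-diagonal residuals (observed minus reproduced correlations)."""
    n = len(observed)
    # Use each item pair once (upper triangle, diagonal excluded).
    resid = [observed[i][j] - reproduced[i][j]
             for i in range(n) for j in range(i + 1, n)]
    squared = [r * r for r in resid]
    absolute = [abs(r) for r in resid]
    return {
        "mean_squared": mean(squared),
        "sd_squared": pstdev(squared),    # population SD: an assumption of this sketch
        "mean_absolute": mean(absolute),
        "sd_absolute": pstdev(absolute),
    }


# Hypothetical 3-item example, for illustration only.
observed = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]
reproduced = [[1.0, 0.4, 0.5], [0.4, 1.0, 0.1], [0.5, 0.1, 1.0]]
summary = residual_fit_summary(observed, reproduced)
```

The same summaries computed for a random data set of matching size give the reference values against which the residuals of an m-factor model are judged.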

Nonlinear Factor Analysis 

McDonald (1967, 1980, 1982) and McDonald and Ahlawat (1974) have 
demonstrated that applying linear factor analysis to unidimensional binary data yields 
"nonlinear factors" rather than "difficulty factors." Nonlinear factors account for nonlinear 
relationships among the variables by using higher order polynomials in the factor model 
(for example, quadratic and cubic terms). McDonald developed the method of nonlinear 
factor analysis (NLFA) to account for the nonlinearity of the data as an improvement over 
linear factor analysis. The variables in the model can be expressed as polynomial functions 
of latent traits or factors. For example, a two-factor model with linear and quadratic 
terms would be of the following form: 

$$y_i = b_{i0} + b_{i11}\theta_1 + b_{i12}\theta_1^2 + b_{i21}\theta_2 + b_{i22}\theta_2^2 + u_i e_i$$

where $y_i$ denotes the examinee's score on item i, $\theta_1$ and $\theta_2$ denote latent traits, $b_{ijk}$ 
denotes the factor loading of the i-th item on the j-th common factor for the k-th degree 
element in the polynomial (with $b_{i0}$ an intercept), $e_i$ denotes the unique factor, and $u_i$ denotes the unique factor 
loading for item i. Hambleton and Rovinelli (1986) have demonstrated the use of NLFA to 
assess dimensionality and found it to be a promising method. They do, however, caution about 
the criterion for the adequacy of the fit of the model. 

In the present study, NLFA as embodied in the computer program NOFA, developed 
by Etazadi-Amoli and McDonald (1983), was used. The fit of the model is studied just as 
in the case of the linear factor analyses, by comparing the means and standard deviations 
of squared residuals and absolute residuals with the corresponding values of random data 
and linear factor analyses. The chi-square statistic values are not available from NOFA. 

Holland and Rosenbaum's Test of Lack of Fit of a 
Unidimensional, Monotone, and Conditional Independent Model 

Rosenbaum (1984) and Holland and Rosenbaum (1986) have proved theorems 
concerning conditional association that can be applied to assess dimensionality. The basic 
notion in Holland and Rosenbaum's (H&R) theorems is that if the items are locally 
independent, unidimensional, and the item characteristic curves are monotone, then the 
items are conditionally positively associated. Specifically, the conditional covariances 
between any pair of item response functions of a set of unidimensional dichotomous item 
responses given any function of the remaining item responses will be nonnegative. The test 
of this relationship can be specified as 

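$$H_0: \operatorname{Cov}\Big(X_i, X_j \,\Big|\, \sum_{k \neq i,j} X_k\Big) \ge 0 \quad \text{vs.} \quad H_1: \operatorname{Cov}\Big(X_i, X_j \,\Big|\, \sum_{k \neq i,j} X_k\Big) < 0$$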

Conditional association for each pair of items is tested, given the number-right 
score on the remaining items. The Mantel-Haenszel test (M-H) (Mantel & Haenszel, 1959) 
is used to test this hypothesis. To perform the M-H test on a given pair of items, a 2x2 
contingency table is constructed for the pair for each of the possible number-right scores 
on the remaining items. The cell values of a 2x2 table for item pair i and j for examinees 
with total score k (k = 1, 2, ..., K) on the remaining items can be denoted as follows: the 
number of examinees who got both item i and item j correct ($n_{11k}$), the number of 
examinees who got both item i and item j incorrect ($n_{00k}$), the number of examinees who 
got item i correct and item j incorrect ($n_{10k}$), and the number of examinees who got item i 
incorrect and item j correct ($n_{01k}$). The M-H statistic is then given by 

$$Z = \frac{n_{11+} - E(n_{11+}) + \tfrac{1}{2}}{\sqrt{V(n_{11+})}}$$

where $n_{11+} = \sum_{k=1}^{K} n_{11k}$, and $E(n_{11+})$ and $V(n_{11+})$ are the expectation and the variance of 
$n_{11+}$, given by 

$$E(n_{11+}) = \sum_{k=1}^{K} \frac{n_{1+k}\, n_{+1k}}{n_{++k}} \qquad (2)$$

and 

$$V(n_{11+}) = \sum_{k=1}^{K} \frac{n_{1+k}\, n_{0+k}\, n_{+1k}\, n_{+0k}}{n_{++k}^{2}\,(n_{++k} - 1)} \qquad (3)$$

The plus subscript in Equations 2 and 3 denotes the summation over that subscript. The 
computed Z-value is compared to the lower tail of the standard normal distribution. A 
statistically significant Z implies that the pair of items in question is not conditionally 
associated, given the sum of the remaining items, and is thus inconsistent with the 
unidimensional model. In this manner, the M-H statistic is computed for all N(N-1)/2 
pairs of items, where N is the total number of items in a test. If a "large" number of pairs 
are shown not to be conditionally associated, then the unidimensional assumption is 
inappropriate. 

Since the H&R approach tests each item pair at significance level α, the simultaneous 
inference for all item pairs can be based on Bonferroni bounds (Holland & Rosenbaum, 
1986; Junker, 1990; Zwick, 1987). According to Bonferroni bounds, one would accept 
$H_0$ if the number of rejections at level α is around tα, where t is the number of tests 
performed, which is equal to N(N-1)/2; one would reject $H_0$ if at least one test is rejected 
at level α/t. 
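
The pairwise procedure can be sketched in Python as follows. This is an illustrative sketch and not the software used in the study: the function names are hypothetical, the one-half continuity correction and the skipping of score groups with fewer than two examinees are assumptions of the sketch, and the demonstration data are random.

```python
import math
import random
from collections import defaultdict
from itertools import combinations
from statistics import NormalDist


def mantel_haenszel_z(x_i, x_j, rest_score):
    """Lower-tail M-H Z for one item pair, stratified by the rest score."""
    tables = defaultdict(lambda: [0, 0, 0, 0])  # counts n11, n10, n01, n00
    cell = {(1, 1): 0, (1, 0): 1, (0, 1): 2, (0, 0): 3}
    for a, b, k in zip(x_i, x_j, rest_score):
        tables[k][cell[(a, b)]] += 1
    n11_plus = expect = var = 0.0
    for n11, n10, n01, n00 in tables.values():
        n = n11 + n10 + n01 + n00
        if n < 2:  # variance term undefined; skipped (assumption of this sketch)
            continue
        r1, r0 = n11 + n10, n01 + n00  # item i correct / incorrect
        c1, c0 = n11 + n01, n10 + n00  # item j correct / incorrect
        n11_plus += n11
        expect += r1 * c1 / n
        var += r1 * r0 * c1 * c0 / (n * n * (n - 1))
    if var == 0:
        return None
    return (n11_plus - expect + 0.5) / math.sqrt(var)


def bonferroni_summary(p_values, alpha=0.05):
    """Count rejections at level alpha and at the Bonferroni level alpha/t."""
    t = len(p_values)
    at_alpha = sum(p < alpha for p in p_values)
    at_bonf = sum(p < alpha / t for p in p_values)
    return {"tests": t, "expected_at_alpha": t * alpha,
            "rejected_at_alpha": at_alpha,
            "rejected_at_alpha_over_t": at_bonf,
            "reject_unidimensionality": at_bonf >= 1}


def holland_rosenbaum(responses, alpha=0.05):
    """Run the M-H test on every item pair of a 0/1 response matrix."""
    totals = [sum(row) for row in responses]
    p_values = []
    for i, j in combinations(range(len(responses[0])), 2):
        x_i = [row[i] for row in responses]
        x_j = [row[j] for row in responses]
        rest = [t - a - b for t, a, b in zip(totals, x_i, x_j)]
        z = mantel_haenszel_z(x_i, x_j, rest)
        if z is not None:
            p_values.append(NormalDist().cdf(z))  # lower-tail p-value
    return bonferroni_summary(p_values, alpha)


# Random demonstration data (hypothetical, not from the study).
rng = random.Random(1)
demo = [[int(rng.random() < 0.5) for _ in range(5)] for _ in range(100)]
result = holland_rosenbaum(demo)
```

The summary reports both counts rather than a single verdict, because the rule for accepting $H_0$ (rejections "around" tα) calls for judgment, whereas the rejection rule at level α/t is mechanical.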

Rosenbaum (1984), Zwick (1987), and Ben-Simon and Cohen (1990) have 
demonstrated the application of the H&R approach to assess dimensionality. Ben-Simon and 
Cohen found the H&R approach to be conservative: it erroneously classified nearly 
half of the multidimensional item pools they analyzed as unidimensional. Zwick found 
the H&R approach to be consistent with other procedures investigated in assessing 
unidimensionality of NAEP reading data. 



DIMTEST 



Stout (1987) developed DIMTEST to test the hypothesis of essential 
unidimensionality: the existence of one dominant dimension. Nandakumar and Stout (in 
press) further modified and improved the performance of DIMTEST. The improvements 
have led to the following: a procedure that is robust against the presence of guessing in item 
responses; better control of the observed level of significance, and greater power; and 
automation of the size of assessment subtests, as described below. The hypothesis to test 
unidimensionality can be stated as 

$$H_0: d_E = 1 \quad \text{vs.} \quad H_1: d_E > 1$$

where $d_E$ denotes the essential dimensionality of the item pool of which the given test 
items are a part. 

In order to apply DIMTEST, it is assumed that a group of J examinees take an 
N-item test. Each examinee produces a vector of responses of 1s and 0s with 1 denoting a 
correct response and 0 denoting an incorrect response. It is also assumed that essential 
independence with respect to some dominant ability θ holds and that the item response 
functions are monotone with respect to the same dominant ability θ. DIMTEST has 
several steps. These are briefly described here (for details see Stout, 1987; Nandakumar & 
Stout, in press). 

Step 1: The N items of the test are split into three subtests: AT1, AT2, and PT. 
First, AT1 items are selected so that these items all measure the same dominant ability. 
This can be achieved either through factor analysis (FA) or through expert opinion (EO). 
If the FA method is chosen, M items with the highest loadings on the second factor (before 
rotation) are selected. In this case, the program automatically determines the size M of 
AT1 as a function of the test length and the sample size. If EO is sought, on the other 
hand, it is recommended that, at most, one-quarter of the total items be selected 
that tap the same ability. After selecting items of AT1, items of AT2 are selected, also of 
the same size M, so that items of AT1 and AT2 have the same difficulty distribution (for 
details see Stout, 1987). The remaining items (n = N - 2M) form the partition subtest PT. In 
the present study, FA is chosen to select AT1 items. For examples where EO is used to 
select AT1 items, see Nandakumar (in press). 

When FA is used to select AT1 items, the given sample of J examinee responses is 
partitioned into two groups. One group of examinee responses (500 examinees 
recommended) is used for exploratory factor analysis to select AT1 and AT2 items, and the 
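other group of examinee responses is used to compute Stout's statistic T. 

Step 2: The second group of examinees (if the first group of examinees is used for 
FA) is partitioned into K subgroups based on their PT score. That is, all examinees 
obtaining the same total score on PT are assigned to the same subgroup k (k = 1, 2, ..., K). 

Step 3: Within each subgroup k, examinee responses to subtest items AT1 and AT2 
are used to compute the unidimensional statistic T given by 

$$T = \frac{T_L - T_B}{\sqrt{2}} \qquad (4)$$

where 

$$T_L = \frac{1}{\sqrt{K}} \sum_{k=1}^{K} \frac{\hat{\sigma}_k^2 - \hat{\sigma}_{U,k}^2}{S_k}$$

is computed using items of AT1, and $T_B$ is computed in the same way using items of AT2. 
The $\hat{\sigma}_k^2$, $\hat{\sigma}_{U,k}^2$, and $S_k$ are given as follows. 
The usual variance estimate for subgroup k is given by 

$$\hat{\sigma}_k^2 = \frac{1}{J_k} \sum_{j=1}^{J_k} \left( Y_j^{(k)} - \bar{Y}^{(k)} \right)^2$$

where 

$$Y_j^{(k)} = \sum_{i=1}^{M} U_{ijk} / M, \qquad \bar{Y}^{(k)} = \sum_{j=1}^{J_k} Y_j^{(k)} / J_k$$

with $U_{ijk}$ (1 or 0) denoting the response for item i by examinee j in subgroup k, and $J_k$ 
denoting the total number of examinees in subgroup k. The "unidimensional" variance 
estimate for subgroup k is given by 

$$\hat{\sigma}_{U,k}^2 = \frac{1}{M^2} \sum_{i=1}^{M} \hat{p}_i^{(k)} \left( 1 - \hat{p}_i^{(k)} \right)$$

where 

$$\hat{p}_i^{(k)} = \sum_{j=1}^{J_k} U_{ijk} / J_k$$

And the standard error of estimate for subgroup k is given by 

$$S_k = \left[ \frac{\left( \hat{\mu}_{4,k} - \hat{\sigma}_k^4 \right) + \hat{\delta}_{4,k} / M^4}{J_k} \right]^{1/2}$$

where 

$$\hat{\mu}_{4,k} = \frac{1}{J_k} \sum_{j=1}^{J_k} \left( Y_j^{(k)} - \bar{Y}^{(k)} \right)^4$$

and 

$$\hat{\delta}_{4,k} = \sum_{i=1}^{M} \hat{p}_i^{(k)} \left( 1 - \hat{p}_i^{(k)} \right) \left( 1 - 2\hat{p}_i^{(k)} \right)^2$$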
The computed T-value is referred to the upper tail of the standard normal 
distribution to obtain the significance level. The significance levels (p-values) associated with 
unidimensional tests are expected to be large, while those associated with 
multidimensional tests are expected to fall at or below the specified level of 
significance. 
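
Steps 2 and 3 can be sketched in Python as shown below. This is not the DIMTEST program: the function names are hypothetical, the variance and standard-error expressions follow the forms given in Stout (1987) as understood here, and the minimum subgroup size and the skipping of subgroups with zero standard error are assumptions of the sketch. The selection of AT1 and AT2 in Step 1 is taken as given.

```python
import math
import random
from collections import defaultdict
from statistics import NormalDist


def _subtest_terms(rows, items):
    """Variance difference and standard error for one subgroup and one subtest."""
    j_k, m = len(rows), len(items)
    y = [sum(r[i] for i in items) / m for r in rows]    # proportion-correct scores
    y_bar = sum(y) / j_k
    sigma2 = sum((v - y_bar) ** 2 for v in y) / j_k     # usual variance estimate
    p = [sum(r[i] for r in rows) / j_k for i in items]  # item proportions correct
    sigma2_u = sum(q * (1 - q) for q in p) / m ** 2     # "unidimensional" variance
    mu4 = sum((v - y_bar) ** 4 for v in y) / j_k
    delta4 = sum(q * (1 - q) * (1 - 2 * q) ** 2 for q in p)
    s2 = ((mu4 - sigma2 ** 2) + delta4 / m ** 4) / j_k
    return sigma2 - sigma2_u, math.sqrt(max(s2, 0.0))


def dimtest_t(responses, at1, at2, pt, min_group_size=2):
    """Return (T, upper-tail p-value) for given AT1, AT2, and PT item indices."""
    groups = defaultdict(list)
    for row in responses:
        groups[sum(row[i] for i in pt)].append(row)     # subgroup by PT score
    terms_l, terms_b = [], []
    for rows in groups.values():
        if len(rows) < min_group_size:                  # assumption of this sketch
            continue
        d1, s1 = _subtest_terms(rows, at1)
        d2, s2 = _subtest_terms(rows, at2)
        if s1 == 0 or s2 == 0:                          # no variability; skipped
            continue
        terms_l.append(d1 / s1)
        terms_b.append(d2 / s2)
    k = len(terms_l)
    if k == 0:
        raise ValueError("no usable subgroups")
    t_l = sum(terms_l) / math.sqrt(k)
    t_b = sum(terms_b) / math.sqrt(k)
    t = (t_l - t_b) / math.sqrt(2)
    return t, 1.0 - NormalDist().cdf(t)


# Random demonstration data (hypothetical, not from the study).
rng = random.Random(0)
demo = [[int(rng.random() < 0.6) for _ in range(8)] for _ in range(300)]
t_demo, p_demo = dimtest_t(demo, at1=[0, 1], at2=[2, 3], pt=[4, 5, 6, 7])
```

Because the statistic is a difference between the AT1 and AT2 terms, passing the same item set for both subtests yields T = 0 exactly, which is a convenient sanity check for an implementation.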

DIMTEST assesses the degree of closeness of an essentially unidimensional model to 
the model generating the observed data. This is done by splitting the test items into three 
subtests (AT1, AT2, and PT) as described above. When the model underlying the test 
item responses is close to essentially unidimensional, items of AT1, AT2, and PT would all 
be of the same dominant dimension; therefore, the value of the statistic T computed based 
on AT1 and AT2 would be "small," leading to the tenability of $H_0$. When the model 
underlying the test responses is not essentially unidimensional, however, items of AT1 
would be dimensionally different from items of AT2 and PT and the value of the statistic 
T will be "large," leading to the rejection of $H_0$. 

DIMTEST has been found to discriminate between unidimensional and 
two-dimensional tests for a variety of simulated test data when the correlation between 
abilities is as high as .7 (Stout, 1987; Nandakumar & Stout, in press). Nandakumar (1991) 
has shown the usefulness of DIMTEST to assess essential unidimensionality in the possible 
presence of several minor abilities. The findings indicate that essential unidimensionality is 
established when each of the minor abilities influences relatively few items, or, if minor 
abilities are influencing many items, the strength of the influence of the minor abilities is 
low. As the strength of the minor abilities increases, the approximation to an essentially 
unidimensional model degenerates, inflating the type-I error of the test of hypothesis of 
essential unidimensionality. Nandakumar (in press) has further replicated these findings on 
a wide variety of real test data. This study also demonstrates the sensitivity of DIMTEST 
to major and minor abilities influencing item responses. 



Description of Test Data 
The Simulated Test Data 



Seven data sets, DATA1-DATA7, were generated. Of the seven, three data sets, 
DATA1-DATA3, are strictly unidimensional, consisting of 25, 40, and 50 items, 
respectively. The other four data sets, DATA4-DATA7, are two-dimensional with length 
N=25 and correlation between abilities ρ=.3, N=25 and ρ=.7, N=50 and ρ=.3, and N=50 
and ρ=.7, respectively. All 7 data sets have 2000 examinees. These data set characteristics 
are summarized in Table 1. 



Table 1 about here 



The unidimensional data sets were generated using the three-parameter logistic 
model given by 

$$P_i(\theta) = c_i + \frac{1 - c_i}{1 + \exp\{-1.7\, a_i (\theta - b_i)\}} \qquad (5)$$



The abilities (θ) were independently generated from the standard normal distribution, and 
the item parameters ($a_i$, $b_i$, $c_i$) of real tests as described in Nandakumar (1991) were used in 
generating item responses. For example, items of DATA1 have a larger variability in 
discrimination power ($a_i$ ranging from 1.22 to 2.82); items of DATA2 have a smaller 
variability of $a_i$, ranging from 1.07 to 2.00. For each simulated examinee, the probability 
of correctly answering each item, $P_i(\theta)$, was computed using the three-parameter logistic 
model. For each item i, a random number between 0 and 1 was generated from a uniform 
distribution. If the computed probability, $P_i(\theta)$, was greater than or equal to the random 
number generated, the examinee was said to have answered the item correctly and was 
given a score of 1; otherwise the examinee was given a score of 0. The two-dimensional test 
data were generated according to the multidimensional compensatory model (Reckase & 
McKinley, 1983) given by 



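$$P_i(\theta_1, \theta_2) = c_i + \frac{1 - c_i}{1 + \exp\{-1.7\,[\,a_{1i}(\theta_1 - b_{1i}) + a_{2i}(\theta_2 - b_{2i})\,]\}} \qquad (6)$$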



The abilities θ = ($\theta_1$, $\theta_2$) were sampled from a bivariate normal distribution with 
both means zero and both variances one. Two levels of correlation coefficients between the 
abilities were used: .3 and .7. The guessing level was taken to be .20 for all tests. The 
discrimination parameters ($a_{1i}$, $a_{2i}$) for each item were independently generated as follows: 



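$$a_{1i} \sim N\!\left(\frac{\mu}{2}, \frac{\sigma}{\sqrt{2}}\right), \qquad a_{2i} \sim N\!\left(\frac{\mu}{2}, \frac{\sigma}{\sqrt{2}}\right),$$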



where μ and σ are the mean and standard deviation of the distribution of discrimination 
parameters of the respective unidimensional tests with the same number of items. Similarly, 
$b_{1i}$ and $b_{2i}$ were assumed to be independent of each other for each item and were generated 
as follows: 

$$b_{1i} \sim N(\mu, \sigma), \qquad b_{2i} \sim N(\mu, \sigma),$$

where μ and σ are the mean and standard deviation of the distribution of difficulty 
parameters of the respective unidimensional test with the same number of items. For 
example, to generate test data DATA4 with N=25 and ρ=.3, the means and standard 
deviations of the a's and b's of the item parameters used for DATA1 were used. The item responses 
(0,1) were generated exactly as described for the unidimensional case by using $P_i(\theta_1, \theta_2)$ of (6). 
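
The generation scheme can be sketched in Python as follows. This is an illustrative sketch rather than the program used in the study: the function names and item parameters are hypothetical, the scaling constant 1.7 is an assumption of the sketch, and correlated abilities are produced from two independent standard normal draws.

```python
import math
import random

D = 1.7  # logistic scaling constant; an assumption of this sketch


def p_3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def p_compensatory(theta1, theta2, a1, a2, b1, b2, c):
    """Two-dimensional compensatory probability of a correct response."""
    z = a1 * (theta1 - b1) + a2 * (theta2 - b2)
    return c + (1.0 - c) / (1.0 + math.exp(-D * z))


def simulate_unidimensional(items, n_examinees, rng):
    """items: list of (a, b, c). Returns an examinee-by-item 0/1 matrix."""
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)
        # Score 1 when the probability is at least the uniform random number.
        data.append([int(p_3pl(theta, a, b, c) >= rng.random())
                     for a, b, c in items])
    return data


def simulate_two_dimensional(items, n_examinees, rho, rng):
    """items: list of (a1, a2, b1, b2, c); rho: correlation between abilities."""
    data = []
    for _ in range(n_examinees):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        theta1 = z1
        theta2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2  # corr(theta1, theta2) = rho
        data.append([int(p_compensatory(theta1, theta2, *item) >= rng.random())
                     for item in items])
    return data


# Hypothetical item parameters, for illustration only.
rng = random.Random(42)
uni_items = [(1.5, 0.0, 0.2), (1.2, -0.5, 0.2), (2.0, 0.8, 0.2)]
two_items = [(0.8, 0.7, 0.0, 0.3, 0.2), (0.6, 0.9, -0.4, 0.1, 0.2)]
uni_data = simulate_unidimensional(uni_items, 50, rng)
two_data = simulate_two_dimensional(two_items, 50, 0.3, rng)
```

When an examinee's ability equals the item difficulty, both functions return c + (1 - c)/2, which is 0.6 for a guessing level of .20; this gives a quick check on the formulas.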

The Real Test Data 

The real test data used in this study came from two different sources. The National 
Assessment of Educational Progress (NAEP, 1988) data for the 1986 U.S. History (HIST) 
and Literature (LIT) for grade 11/age 17 were obtained from Educational Testing Service. 
The Armed Services Vocational Aptitude Battery (ASVAB) data for Arithmetic Reasoning 
(AR) and General Science (GS) for grade 10 were obtained from Linn, Hastings, Hu, and 
Ryan (1987). For all data sets, examinees who missed one or more items were deleted from 
the analyses. Test sizes and sample sizes for all real tests are given in the bottom half of 
Table 1. Since all four data sets were assessed as unidimensional by the methods employed 
in this article (details are provided in the Results section), they were combined to form 
two-dimensional tests. Four two-dimensional tests were formed as follows. The test data 
HSTLIT1 was formed by combining the data of 31 items of HIST with the data of 5 items 
of LIT randomly selected from 30 items. Similarly, HSTLIT2 was formed by combining the 
responses of 31 items of HIST with the responses of 10 items of LIT, and the test data ARGS 
was formed by combining responses of 30 items of AR with the responses of 10 items of GS. 
The two-dimensional test HSTGEO contains 31 history items spanning U.S. history from 
the colonization period to modern times (HIST) and in addition contains 5 map items 
requiring the knowledge of geographical location of different countries in the world. This is 
the actual history test according to NAEP. But it was shown using DIMTEST that the 5 
map items formed a separate dimension significantly different from history items 
(Nandakumar, in press). Hence the data on these 5 map items were removed from the 
history test to form HIST with 31 items, and the original history data were treated as a 
natural two-dimensional test. 

Results 



The results of DIMTEST and the H&R approach will be studied together and 
compared because of the similarity in the underlying theory and because both of them are 
statistical tests. Likewise the results of linear and nonlinear factor analysis will be studied 
and compared together. 



The Simulated Test Data 



DIMTEST and H&R Procedure 

The results of DIMTEST and the H&R approach for simulated data are presented 
at the top of Table 2. For all data sets, the significance levels associated with DIMTEST 
indicate that DIMTEST is able to correctly confirm unidimensionality and detect lack of 
unidimensionality for both correlation (between abilities) levels ρ=.3 and ρ=.7. For 
example, all three unidimensional data sets, DATA1-DATA3, have small T-values and 
large significance levels, implying the acceptance of the null hypothesis of essential 
unidimensionality (here the data were simulated as strictly unidimensional). 
Two-dimensional data, DATA4-DATA7, on the other hand, have large T-values, strongly 
rejecting the null hypothesis of essential unidimensionality. 



Table 2 about here 



The results of the H&R approach indicate that for unidimensional tests, the number 
of significant negative partial associations at level α (α=.05) is far below the expected 
number (tα), strongly confirming the unidimensional nature of these data sets. Among the 
two-dimensional data sets, DATA4 and DATA6 (ρ=.3) were correctly assessed as 
multidimensional. For these data, the number of significant negative partial associations at 
level α was beyond the tα level, and the number of significant negative partial associations 
beyond level α/t was 15 and 1, respectively, identifying them as multidimensional. The 
test data DATA5 and DATA7 (ρ=.7), on the other hand, were assessed as unidimensional. 
For DATA5 and DATA7, the number of significant negative partial associations at level α 
was within the tα level, and the number of significant negative partial associations beyond 
level α/t was zero, making them unidimensional tests. It was disappointing to note that for 
many of the item pairs measuring different traits, in two-dimensional tests, the covariance 
did not approach significance. One reason for this could be the noise in the conditional 
score. More research is necessary to draw definite conclusions. 

Linear and Nonlinear Factor Analysis 

The computer programs used to do the analyses, LISCOMP and NOFA, are 
computationally intensive and consume enormous CPU time. In addition, LISCOMP cannot 
handle more than about 40 variables. For these reasons, not all data sets were included 
in the linear factor analyses, but all data sets were included in the nonlinear factor 
analyses. The results of linear and nonlinear factor analyses are presented in Table 3. 



Table 3 about here 



Based on parallel analyses, one factor would be retained for DATA1, DATA2, and 
DATA5; two factors would be retained for DATA4. By contrast, according to the significance 
levels associated with a chi-square test of goodness of fit, in Table 3, a two-factor model 
fits DATA1, a four-factor model fits DATA2 and DATA4, and a three-factor model fits 
DATA5. Similar chi-square values are not available for nonlinear models. 

The goodness of fit statistics (the means and standard deviations of squared 
residuals and absolute residuals) are reported for all data sets in Table 3. The top entry in 
Table 3 refers to random data (RANDOM) with 25 variables and 2000 examinees. Because 
of the cost of computations, only one random data set was used to compare the goodness of 
fit statistics. Comparing goodness of fit statistics of RANDOM with DATA1, it appears 
that both one-factor quadratic and one-factor cubic models fit as well as the four-factor 
linear model. However, since the differences in the magnitude of residuals among models 
are small, one could argue that the four-factor linear and one-factor quadratic or cubic 
models are an overfit and that one should go with a more parsimonious model. Inspection 
of the significance values of the chi-square test of goodness of fit indicates that the 
two-factor model fits the data. If one strictly applies the criterion of using random data 
residuals as a guide to determine the number of factors, however, a one-factor model with a 
quadratic term seems to be the right choice. Similar observations can be made for DATA2. 
Comparing goodness of fit statistics for linear and nonlinear factor analysis, it can be seen 
that for DATA4 and DATA5, the two-factor quadratic model fits better than the 
three-factor linear model, confirming the two-dimensional nature of the data. Here again 
one could argue, based on the absolute residuals, that the differences in the residuals are 
small and that the quadratic models or the three-factor and four-factor linear models are an 
overfit. The significance values associated with the chi-square test indicate overestimation 
of factors for DATA4. As expected, the means and the standard deviations of squared 
residuals and absolute residuals are much larger for DATA4 (ρ=.3) than for DATA5 (ρ=.7), 
reflecting more deviation from unidimensionality for DATA4. For DATA5, the goodness of 
fit analyses support a one-factor quadratic model. Likewise, the two-factor quadratic model 
fits DATA6, and the one-factor quadratic model fits DATA7. 
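
The residual-based fit statistics discussed above are simple summaries of the off-diagonal residual correlations. The following is a minimal sketch that follows the wording of the text (means and standard deviations of squared and absolute residuals). It assumes the residual matrix is available as a list of lists; the function name, the toy matrix, and the use of the population standard deviation are assumptions, and nothing here reproduces LISCOMP or NOFA output.

```python
import statistics


def residual_fit_stats(resid):
    """Summarize the residual correlations above the diagonal of a square matrix."""
    n = len(resid)
    r = [resid[i][j] for i in range(n) for j in range(i + 1, n)]
    squared = [x * x for x in r]
    absolute = [abs(x) for x in r]
    return {
        "mean_sq": statistics.mean(squared),
        "sd_sq": statistics.pstdev(squared),      # population SD (assumed)
        "mean_abs": statistics.mean(absolute),
        "sd_abs": statistics.pstdev(absolute),    # population SD (assumed)
    }


# Hypothetical 3-item residual matrix, for illustration only.
toy = [
    [0.00, 0.02, -0.01],
    [0.02, 0.00, 0.03],
    [-0.01, 0.03, 0.00],
]
print(residual_fit_stats(toy))
```

Smaller values on all four summaries indicate that the fitted model reproduces the observed correlations more closely, which is the sense in which the models are compared in Table 3.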

In summary, there are many criteria that can be used to assess dimensionality by the 
linear factor analysis approach. The different criteria may give rise to different conclusions 
regarding the dimensionality of the data set under consideration. In the present study it is 
shown that the significance values associated with the chi-square test overestimated the 
number of factors in most cases. Parallel analyses correctly identified the dimensionality in 
some cases. Nonlinear factor analyses exhibited a better fit than the linear factor analyses. 
DIMTEST and H&R procedures were excellent in confirming unidimensionality. 
DIMTEST demonstrated greater power in detecting multidimensionality for correlations 
between abilities as high as .7. H&R and nonlinear factor analysis methods demonstrated 
good power provided the correlation between abilities was low (ρ=.3). 

The Real Test Data 

DIMTEST and H&R Procedure 

The results of DIMTEST and H&R for real data sets are presented at the bottom of 
Table 2. For data sets LIT, HIST, AR, and GS, the T-values associated with DIMTEST 
indicate that these data can be approximated by an essentially unidimensional model. The 
results of the H&R approach for these data are also consistent with the DIMTEST results 
in that the number of significant negative partial associations, for each one of the tests, is 
less than the nominal level tα. While both approaches strongly support that HIST, AR, and 
GS are essentially unidimensional, the decision is not clear for LIT because there is one 
negative partial association that is significant beyond level α/t, and the T-value of 
DIMTEST is in the borderline region, indicating the presence of violations of the 
unidimensionality hypothesis. 

For the two-dimensional data HSTLIT1, HSTLIT2, ARGS, and HSTGEO, the 
T-values associated with DIMTEST strongly indicate the multidimensional nature of these 
data. The relatively large T-values associated with ARGS and HSTGEO indicate that 
abilities within these tests are more orthogonal than abilities in HSTLIT1 and HSTLIT2. 
The results based on the H&R approach, however, indicate that all four data sets are 
unidimensional. For each one of the two-dimensional data sets, the number of significant 
negative partial associations is well below the nominal level tα, and none of the partial 
associations are significant beyond level α/t. Even with a liberal α = .10, the number of 
negative partial associations did not rise above the nominal level for any of the tests. These 
results suggest that the H&R approach lacks power. 

On further examination of the H&R results, it was found that the M-H test statistics 
for many of the item pairs, where items were supposed to be measuring different traits, did 
not reach the significance level. One explanation for this could be that for these item pairs, 
the conditional score (ΣX_k), on the basis of which the examinees are classified into 
different groups, may be contaminated with items tapping different abilities. This could be 
especially true for HSTLIT2 and ARGS, where one quarter of the test items are from the 
second dominant dimension. Because of the noise in the conditional score distribution, the 
covariance of item pairs measuring different abilities may not be significantly negative. A 
proper conditional score may considerably increase the power of the H&R approach. 

Linear and Nonlinear Factor Analysis 

The results of linear and nonlinear factor analysis for a selection of real data sets are 
reported in Table 4. The results are consistent with the simulated test data in that for all 
cases nonlinear factor models fit better than linear factor models. According to the 
chi-square test of goodness of fit, the four-factor model was the best fitting for all data sets 
where linear factor analysis was performed. Based on goodness of fit statistics, a one-factor 
quadratic model fits LIT, AR, and HSTLIT1 better than three- or four-factor linear 
models. Since a one-factor quadratic model fits as well as a two-factor quadratic model, the 
more parsimonious model is strongly recommended in these cases. For HSTLIT2 and 
ARGS, again it appears that a one-factor quadratic model is appropriate. If chi-square 
statistics had been available along with the goodness of fit statistics for nonlinear factor 
analyses, they would have aided in the interpretation. 



Table 4 about here 



In summary, for real data sets, the results are somewhat consistent with simulated 
data sets. For data sets assessed as unidimensional by DIMTEST and H&R, the chi-square 
tests based on the linear factor analysis indicated a four-factor model for the same data. 
Although we do not know the true dimensionality of real data, these results suggest that 
linear factor analysis is overestimating the underlying dimensionality. By contrast, the other 
three methodologies were excellent in identifying essential unidimensionality but differed in 
identifying lack of unidimensionality. DIMTEST demonstrated greater power than either 
the H&R or the nonlinear factor analysis methods. It appears that with the appropriate 
conditional score the power of the H&R approach could be improved, and with some type 
of fit statistics and the associated significance levels, the power of nonlinear factor analysis 
could be improved. 
Discussion 

Based on this limited study, findings demonstrate that the linear factor analysis 
approach to assessing essential unidimensionality is not satisfactory. This finding is 
consistent with the previous research and theory (see, for example, Hambleton & Rovinelli, 
1986; Hattie, 1984). In contrast to linear factor analysis, DIMTEST, H&R, and nonlinear 
factor analysis were each shown to be promising methodologies to assess dimensionality. 

In this study, all three methodologies exhibited sensitivity to discriminate between 
one- and two-dimensional test data. For simulated unidimensional test data, all three 
procedures were able to confirm unidimensionality. For the real data, all three procedures 
were consistent in identifying the unidimensionality of HIST, AR, and GS. For 
two-dimensional test data, however, the three procedures differed in their ability to detect 
the lack of unidimensionality. DIMTEST rejected the null hypothesis of essential 
unidimensionality for all two-dimensional tests, both real and simulated. The H&R 
approach confirmed the lack of unidimensionality for two-dimensional simulated tests, 
provided the correlation between abilities was low (ρ=.3). For simulated test data with 
high correlation between abilities (ρ=.7), the H&R approach was unable to detect 
multidimensionality. Also, for all two-dimensional real test data, the H&R approach was 
unable to detect multidimensionality. 

The performance of the nonlinear factor analysis methodology was similar to the 
H&R procedure for two-dimensional data sets. For simulated test data with ρ=.3, the 
two-factor model with linear and quadratic terms demonstrated adequate fit statistics 
(smaller means and standard deviations of squared residuals and absolute residuals). For 
simulated tests with ρ=.7, however, the difference in fit statistics between one-factor and 
two-factor quadratic models was not evident. Similarly, for the two-dimensional real test 
data HSTLIT2 and ARGS, the difference in fit statistics between one-factor and two-factor 
models with linear and quadratic terms was not evident. The difficulty in deciding about 
the correct model arises because there is no concrete way of assessing what is meant by 
"sufficiently small" for goodness of fit statistics. 

In this study, the results associated with the H&R approach were consistent with 
the findings of Ben-Simon and Cohen (1990) and Zwick (1987). The number of significant 
negative partial associations for unidimensional tests was far below the expected five 
percent level, making it a very conservative test. Consequently, it did not exhibit high 
power. The reason one observes fewer than the nominal level of negative partial 
associations is that the conditional score used in computing the covariances is not 
perfectly correlated with the latent variable (Zwick, 1987). According to the theorems 
proved by Holland and Rosenbaum (1986), the conditional score used to compute the 
covariances can be any function of the latent trait. An appropriate choice of conditional 
score, therefore, could maximize the power of the H&R approach. 

The results of nonlinear factor analyses were consistent with the findings of 
Hambleton and Rovinelli (1986). Factor models with linear and quadratic terms were able 
to fit the data better than models with just linear terms. The problem with nonlinear 
factor analysis is determining the appropriate number of polynomial terms to retain in the 
model. This problem suggests that some type of adequacy of fit statistic with an associated 
sampling distribution would be necessary to aid in assessing the fit of nonlinear models. 

In terms of assessing the degree of multidimensionality, both the DIMTEST and 
nonlinear factor analysis approaches can be useful. The T-values associated with 
DIMTEST and the fit statistics associated with nonlinear factor analysis can be helpful in 
assessing the degree of multidimensionality. For example, both HIST and AR are 
considered essentially unidimensional data sets, and the associated T-values are -1.53 
and 1.18, respectively. By contrast, for the two-dimensional data set HSTLIT2, T=2.03. The 
difference in the T-values mirrors the degree of multidimensionality present in the data. 
Similarly, the difference in fit statistics between one-factor and two-factor quadratic 
models for DATA1 and DATA4 reflects the degree of multidimensionality. 
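
To relate the T-values quoted above to significance levels, the sketch below converts T to an upper-tail standard normal probability. It assumes that T is referred to the upper tail of the standard normal distribution, which agrees with the T and p pairs listed for these data sets in Table 2; the function name is hypothetical and this is not DIMTEST code.

```python
import math


def dimtest_p_value(t_stat):
    """Upper-tail standard normal probability for a given T-value (assumed reference)."""
    return 0.5 * math.erfc(t_stat / math.sqrt(2.0))


# T-values quoted in the text for two unidimensional sets and one two-dimensional set.
for name, t_stat in [("HIST", -1.53), ("AR", 1.18), ("HSTLIT2", 2.03)]:
    print(name, round(dimtest_p_value(t_stat), 3))
```

Under this assumption a negative T, as for HIST, gives a probability above .5 and therefore no evidence against essential unidimensionality.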

In the present study, the test length is 25 items or more, and the sample sizes are 
around 2000 examinees. It is not known if the results would hold up for small test lengths 
and sample sizes. De Champlain and Gessaroli (1991) have shown that DIMTEST loses 
power when both the test length and the sample size are small (for example, N=25 and 
J=500). Their results show support for the use of the incremental fit index (IFI) from the 
nonlinear factor analysis program NOHARM II to assess dimensionality in cases of 
smaller test lengths and sample sizes. Ben-Simon and Cohen (1990) have found that the 
test length and the sample size had a marked effect on the M-H Z-statistic in the 
detection of multidimensionality. In their study they tried test lengths of 20, 30, 40, and 50 
and sample sizes of 1000, 2000, 3000, and 4000. They found that larger samples and longer 
tests facilitated the detection of multidimensionality. They urge a cautious interpretation 
of M-H test results in light of test lengths and sample sizes. 

Just as linear and nonlinear methodologies share the same philosophical theory, the 
DIMTEST and H&R approaches share the same theoretical framework. The basic rationale 
for the H&R approach is to reject the locally independent, monotone, unidimensional 
model if the conditional covariances are significantly negative. By contrast, DIMTEST 
rejects the essentially independent, monotone, essentially unidimensional model if the 
conditional covariances are significantly positive (it can be shown that the expected value 
of the numerator of Stout's statistic T is mathematically equivalent to the average 
conditional covariance among AT1 items; Stout, 1987). This apparent contradiction in the 
criterion for assessing unidimensionality may be resolved by noting the subtle difference in 
the item pair covariances under consideration. In the H&R approach, one expects the 
conditional covariance between items measuring different traits to be negative, whereas in 
Stout's approach, one expects the asymptotic conditional covariance between items 
measuring the same trait to approach zero. DIMTEST is specifically designed to assess 
unidimensionality and thus looks for the existence of at least two dominant dimensions. By 
contrast, the H&R approach looks at all item pairs and detects items that are not 
measuring the same trait as other items of the test. 
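
Both procedures rest on the covariance of an item pair among examinees with the same conditional score. The sketch below illustrates that quantity only: it is neither the Mantel-Haenszel test used in the H&R approach nor Stout's T statistic. It assumes dichotomous (0/1) responses and takes the number-correct score on the remaining items as the conditional score; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def conditional_covariance(responses, i, j):
    """Average covariance of items i and j within groups of equal rest score.

    responses -- list of examinee response vectors with 0/1 entries
    The rest score is the number-correct score on all items except i and j.
    """
    groups = defaultdict(list)
    for row in responses:
        rest = sum(row) - row[i] - row[j]
        groups[rest].append((row[i], row[j]))
    total, weighted = 0, 0.0
    for pairs in groups.values():
        n = len(pairs)
        if n < 2:
            continue  # covariance is undefined for a single examinee
        mean_i = sum(a for a, _ in pairs) / n
        mean_j = sum(b for _, b in pairs) / n
        cov = sum((a - mean_i) * (b - mean_j) for a, b in pairs) / n
        weighted += n * cov   # weight each group by its size
        total += n
    return weighted / total if total else 0.0


# Toy data: items 0 and 1 always agree within each rest-score group.
agree = [[1, 1, 1], [0, 0, 1], [1, 1, 0], [0, 0, 0]]
# Toy data: items 0 and 1 always disagree within each rest-score group.
disagree = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 1, 0]]
print(conditional_covariance(agree, 0, 1), conditional_covariance(disagree, 0, 1))
```

A clearly negative value for a pair is the kind of evidence the H&R approach looks for, whereas DIMTEST aggregates positive conditional covariances among a selected subset of items.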

As for the computational time involved, DIMTEST is the most efficient. The 
computational time involved for the other procedures is considerably greater. For example, 
for a 25-item test with 2000 examinees, DIMTEST uses 4 seconds of CPU time, the H&R 
approach uses 24 seconds, and nonlinear factor analysis uses 42 seconds; for a 50-item test 
with 2000 examinees, DIMTEST uses 8 seconds, the H&R approach uses 106 seconds, and 
nonlinear factor analysis uses 191 seconds. As the test length increases, the H&R approach 
requires disproportionately more time, and the same is true for the nonlinear factor 
analysis as the test length increases and/or the model gets more complex. 
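
One plausible reason for the disproportionate growth of the H&R timings is that the procedure examines every item pair, and the number of pairs grows quadratically with test length. The short sketch below compares the growth in item pairs with the growth in the reported H&R CPU times. Attributing the cost to the pair count is an assumption made for illustration, and the variable names are hypothetical.

```python
from math import comb


def n_item_pairs(n_items):
    """Number of distinct item pairs in a test of n_items items."""
    return comb(n_items, 2)


hr_cpu_seconds = {25: 24, 50: 106}   # H&R timings reported in the text
pair_ratio = n_item_pairs(50) / n_item_pairs(25)
time_ratio = hr_cpu_seconds[50] / hr_cpu_seconds[25]
print(n_item_pairs(25), n_item_pairs(50))
print(round(pair_ratio, 2), round(time_ratio, 2))
```

Doubling the test length from 25 to 50 items roughly quadruples the number of item pairs (300 versus 1225), which is of the same order as the increase in the reported H&R CPU time.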

Notes 

1. The reader is reminded that testing for unidimensionality is not synonymous with 
testing for model-data fit. If a unidimensional model is to be applied to the data, testing for 
unidimensionality is the first step. If item responses are essentially unidimensional, then as 
a second step, one can test for model-data fit, such as the fit of the one-parameter logistic 
model, the two-parameter logistic model, etc. 

Table 1 
Description of Data Sets 

                                            Number of items of each trait 
Name       J       Traits   ρ      N        Trait1    Trait2    Mixed 

Simulated data sets 
DATA1      2000    1               25       25        0         0 
DATA2      2000    1               40       40        0         0 
DATA3      2000    1               50       50        0         0 
DATA4      2000    2        .3     25       8         8         9 
DATA5      2000    2        .7     25       8         8         9 
DATA6      2000    2        .3     50       16        16        17 
DATA7      2000    2        .7     50       16        16        17 

Real data sets 
LIT        2439    1               30       30        0         0 
HIST       2428    1               31       31        0         0 
AR         1984    1               30       30        0         0 
GS         1990    1               25       25        0         0 
HSTLIT1    2428    2               36       31        5         0 
HSTLIT2    2428    2               41       31        10        0 
ARGS       1853    2               40       30        10        0 
HSTGEO     2440    2               36       31        5         0 

Note. J denotes the number of examinees; ρ denotes the correlation between traits; 
N denotes the test length; mixed items are a combination of both traits 1 and 2. 



Table 2 
Results of DIMTEST and H&R Analyses 

           DIMTEST                        H&R Test 
           H0: d_E = 1                    H0: cov(X_i, X_j | ΣX_k) ≥ 0 

                            Decision      No. of    No. of pairs   No. of pairs   Decision based 
                            based on      item      significant    significant    on Bonferroni 
Name       T        p       DIMTEST       pairs t   at level α     at level α/t   bounds 

Simulated test data 
DATA1      -1.05    .85     accept H0     300       1              0              accept H0 
DATA2      -0.75    .77     accept        780       3              0              accept 
DATA3      -0.94    .83     accept        1225      10             0              accept 
DATA4      7.19     .000    reject        300       71             15             reject 
DATA5      3.62     .000    reject        300       10             0              accept 
DATA6      10.13    .000    reject        1225      206            1              reject 
DATA7      2.41     .008    reject        1225      56             0              accept 

Real test data 
LIT        1.70     .045    accept        435       16             1              undecided 
HIST       -1.53    .937    accept        465       6              0              accept 
AR         1.18     .118    accept        435       3              0              accept 
GS         -0.14    .555    accept        300       6              0              accept 
HSTLIT1    3.01     .036    reject        630       17             0              accept 
HSTLIT2    2.03     .021    reject        820       18             0              accept 
ARGS       6.15     .000    reject        780       4              0              accept 
HSTGEO     6.19     .000    reject        630       17             0              accept 

significant at .05 level 



Table 3 
Results of Linear and Nonlinear Factor Analysis 
For Simulated Test Data: Goodness of Fit Statistics 

Model                          Mean r_ij^2 *   SD(r_ij)   Mean |r_ij|   SD(|r_ij|)   p ** 

RANDOM 
  Linear Factor Analysis 
    1 Factor                   .0009           .0308      .0250         .0182 
    2 Factor                   .0008           .0283      .0225         .0169 
    3 Factor                   .0007           .0246      .0207         .0160 
    4 Factor                   .0006           .0245      .0196         .0147 

DATA1 
  Linear Factor Analysis 
    1 Factor                   .0017           .0412      .0333         .0242        .006 
    2 Factor                   .0013           .0359      .0286         .0218        .350 
    3 Factor                   .0011           .0332      .0262         .0204        .610 
    4 Factor                   .0009           .0303      .0236         .0191        .860 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0003           .0185      .0147         .0113 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    1 Factor Cubic             .0003           .0185      .0147         .0113 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + b_{i3}\theta^3 + d_i u_i) 

DATA2 
  Linear Factor Analysis 
    1 Factor                   .0110           .1049      .0982         .0369        .000 
    2 Factor                   .0091           .0954      .0896         .0327        .000 
    3 Factor                   .0070           .0834      .0774         .0310        .000 
    4 Factor                   .0061           .0779      .0720         .0278        .000 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0003           .0186      .0148         .0113 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    1 Factor Cubic             .0003           .0185      .0148         .0113 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + b_{i3}\theta^3 + d_i u_i) 

DATA3 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0003           .0186      .0147         .0115 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    1 Factor Cubic             .0003           .0175      .0138         .0108 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + b_{i3}\theta^3 + d_i u_i) 

DATA4 
  Linear Factor Analysis 
    1 Factor                   .0203           .1425      .1108         .0900        .000 
    2 Factor                   .0017           .0412      .0334         .0240        .000 
    3 Factor                   .0012           .0346      .0276         .0212        .008 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0021           .0465      .0523         .0379 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0003           .0171      .0131         .0109 

DATA5 
  Linear Factor Analysis 
    1 Factor                   .0047           .0686      .0556         .0409        .000 
    2 Factor                   .0014           .0374      .0313         .0218        .011 
    3 Factor                   .0012           .0346      .0289         .0199        .245 
    4 Factor                   .0010           .0316      .0254         .0181        .600 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0009           .0307      .0246         .0186 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0003           .0174      .0138         .0107 
      (Y_i = b_{i0} + b_{i11}\theta_1 + b_{i12}\theta_1^2 + b_{i21}\theta_2 + b_{i22}\theta_2^2 + d_i u_i) 

DATA6 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0005           .0242      .0204         .0172 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0003           .0182      .0145         .0111 

DATA7 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0005           .0223      .0176         .0137 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0003           .0175      .0140         .0105 
      (Y_i = b_{i0} + b_{i11}\theta_1 + b_{i12}\theta_1^2 + b_{i21}\theta_2 + b_{i22}\theta_2^2 + d_i u_i) 

* r_ij are the residual correlations. 
** p-value associated with the chi-square test of goodness of fit. 






Table 4 
Results of Linear and Nonlinear Factor Analysis 
For Real Test Data: Goodness of Fit Statistics 

Model                          Mean r_ij^2 *   SD(r_ij)   Mean |r_ij|   SD(|r_ij|)   p ** 

LIT 
  Linear Factor Analysis 
    1 Factor                   .0034           .0584      .0465         .0354        .000 
    2 Factor                   .0028           .0526      .0428         .0307        .000 
    3 Factor                   .0019           .0439      .0349         .0267        .000 
    4 Factor                   .0015           .0391      .0310         .0240        .000 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0008           .0278      .0216         .0176 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0004           .0207      .0162         .0130 
      (Y_i = b_{i0} + b_{i11}\theta_1 + b_{i12}\theta_1^2 + b_{i21}\theta_2 + b_{i22}\theta_2^2 + d_i u_i) 

AR 
  Linear Factor Analysis 
    1 Factor                   .0047           .0683      .0569         .0378        .000 
    2 Factor                   .0032           .0561      .0468         .0310        .000 
    3 Factor                   .0024           .0489      .0400         .0281        .000 
    4 Factor                   .0020           .0447      .0362         .0262        .000 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0007           .0265      .0200         .0174 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0004           .0190      .0146         .0122 

HSTLIT1 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0008           .0275      .0213         .0175 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0003           .0185      .0143         .0118 
      (Y_i = b_{i0} + b_{i11}\theta_1 + b_{i12}\theta_1^2 + b_{i21}\theta_2 + b_{i22}\theta_2^2 + b_{i23}\theta_1\theta_2 + d_i u_i) 

HSTLIT2 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0006           .0236      .0181         .0152 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0004           .0191      .0150         .0119 
      (Y_i = b_{i0} + b_{i11}\theta_1 + b_{i12}\theta_1^2 + b_{i21}\theta_2 + b_{i22}\theta_2^2 + b_{i23}\theta_1\theta_2 + d_i u_i) 

ARGS 
  Nonlinear Factor Analysis 
    1 Factor Quadratic         .0021           .0462      .0268         .0376 
      (Y_i = b_{i0} + b_{i1}\theta + b_{i2}\theta^2 + d_i u_i) 
    2 Factor Quadratic         .0004           .0192      .0003         .0123 
      (Y_i = b_{i0} + b_{i11}\theta_1 + b_{i12}\theta_1^2 + b_{i21}\theta_2 + b_{i22}\theta_2^2 + b_{i23}\theta_1\theta_2 + d_i u_i) 
    3 Factor Quadratic         .0004           .0175      .0003         .0111 
      (Y_i = b_{i0} + b_{i11}\theta_1 + b_{i12}\theta_1^2 + b_{i21}\theta_2 + b_{i22}\theta_2^2 + b_{i31}\theta_3 
       + b_{i32}\theta_3^2 + b_{i33}\theta_1\theta_2 + b_{i34}\theta_1\theta_3 + b_{i35}\theta_2\theta_3 + d_i u_i) 

* r_ij are the residual correlations. 
** p-value associated with the chi-square test of goodness of fit. 